Speaker Clustering Based on Utterance-Oriented Dirichlet Process Mixture Model

نویسندگان

  • Naohiro Tawara
  • Shinji Watanabe
  • Tetsuji Ogawa
  • Tetsunori Kobayashi
چکیده

This paper provides the analytical solution and algorithm of UO-DPMM based on a non-parametric Bayesian manner, and thus realizes fully Bayesian speaker clustering. We carried out preliminary speaker clustering experiments by using a TIMIT database to compare the proposed method with the conventional Bayesian Information Criterion (BIC) based method, which is an approximate Bayesian approach. The results showed that the proposed method outperformed the conventional one in terms of both computational cost and robustness to changes in tuning parameters.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Fully Bayesian speaker clustering based on hierarchically structured utterance-oriented Dirichlet process mixture model

We have proposed a novel speaker clustering method based on a hierarchically structured utterance-oriented Dirichlet process mixture model. In the proposed method, the number of speakers can be determined from the given data using a nonparametric Bayesian manner and intra-speaker variability is successfully handled by multi-scale mixture modeling. Experimental result showed that the proposed me...

متن کامل

A sampling-based speaker clustering using utterance-oriented Dirichlet process mixture model and its evaluation on large scale data

An infinite mixture model is applied to model-based speaker clustering with sampling-based optimization to make it possible to estimate the number of speakers. For this purpose, a framework of non-parametric Bayesian modeling is implemented with the Markov chain Monte Carlo and incorporated in the utterance-oriented speaker model. The proposed model is called the utterance-oriented Dirichlet pr...

متن کامل

Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....

متن کامل

Infinite models for speaker clustering

In this paper we propose the use of infinite models for the clustering of speakers. Speaker segmentation is obtained trough a Dirichlet Process Mixture (DPM) model which can be interpreted as a flexible model with an infinite a priori number of components. Learning is based on a Variational Bayesian approximation of the infinite sequence. DPM model is compared with fixed prior systems learned b...

متن کامل

Supervised Learning of Acoustic Models in a Zero Resource Setting to Improve DPGMM Clustering

In this work we utilize a supervised acoustic model training pipeline without supervision to improve Dirichlet process Gaussian mixture model (DPGMM) based feature vector clustering. We exploit methods common in supervised acoustic modeling to unsupervisedly learn feature transformations for application to the input data prior to clustering. The idea is to automatically find mappings of feature...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011